ABSTRACT

DevR regulon function is believed to be crucial for 
the survival of Mycobacterium tuberculosis during 
dormancy. In this study, we undertook a compre- 
hensive analysis of the DevR regulon. All the 
regulon promoters were assigned to four classes 
based on the number of DevR binding sites (Dev 
boxes). A minimum of two boxes are essential for 
complete interaction and their tandem arrangement 
is an architectural hallmark at all promoters. Initial 
interaction of DevR with the conserved box is es- 
sential for its cooperative binding to adjacent sites 
bearing low to very poor sequence conservation 
and is the universal mechanism underlying DevR- 
mediated transcriptional induction. The functional 
importance of tandem arrangement was established 
by analyzing promoter variants harboring Dev boxes 
with altered spacing. Conserved sequence logos 
were generated from 47 binding sequences which 
included 24 newly discovered Dev boxes. In each 
half site of an 18-bp binding motif, G5 and C7 are 
essential for DevR binding. Finally, we show that 
DevR regulon induction occurs in a temporal manner 
and genes that are induced early are also usually 
powerfully induced. The information theory-based 
approach along with binding and temporal expres- 
sion studies provide us with comprehensive insights 
into the complex pattern of DevR regulon activation. 

INTRODUCTION 

Mycobacterium tuberculosis (Mtb) is one of the most
successful pathogens in history; it accounted for an
estimated 1.7 million deaths in 2009 (1), making
tuberculosis a global health emergency. The success of Mtb is
attributed in large measure to its ability to cause and 
sustain a persistent and latent infection, sometimes even 
for decades. Recent studies showed the importance of 
DevR regulon genes as potential markers of latent infec- 
tion (2-4). A clear understanding of the expression pattern 
of DevR regulon genes will facilitate the identification of 
early and late dormancy antigens and potential candidates 
for subunit vaccines against latent tuberculosis. 

Hypoxia, nitric oxide and nutrient starvation are some 
of the conditions which are believed to be associated with 
initiation and maintenance of Mtb dormancy (5-7). 
Carbon monoxide and ascorbic acid have also been impli- 
cated recently in dormancy adaptation (8-10). Hypoxia, 
nitric oxide, carbon monoxide and ascorbic acid signals 
induce a set of ~48 genes via the DevRS two-component 
system [also called DosRS (8-11)]. Ascorbic acid mimics 
multiple intracellular stresses and exerts wide-ranging 
effects on Mtb gene expression, including induction of 
the DevR regulon. The modulation of gene expression is 
accompanied by growth arrest and a 'dormant' drug- 
tolerant phenotype under in vitro and ex vivo conditions 
(10). Due to its importance in virulence and dormancy 
(12-18), DevRS is undoubtedly the best characterized 
two-component system of Mtb. In silico analysis of 
DevR regulon promoters revealed the presence of one or 
more copies of a consensus binding sequence located in 
the upstream promoter regions of target genes (11). We 
showed that activated (phosphorylated) DevR binds
cooperatively to specific DNA sequences (Dev boxes) to
activate target gene expression (19-21). Although we have a
fair idea of the mechanism of DevR-mediated transcription
regulation (19-21), our understanding is incomplete
because the structures of all the constituent promoters
have not been determined experimentally.

The crystal structure of DevRc-DNA complex has been 
solved (22). In this structure, a dimer of the DevR 
C-terminal domain (DevRc) interacts with the G4G5G6A7C8T9
motif present in each half of a palindromic consensus
sequence. However, the relative importance of each
of these nucleotides in DevR interaction is not known.
This knowledge is crucial for understanding DevR-DNA
interaction in the natural genomic context, wherein consensus
sites are absent and target genes are expressed
at various levels (11). We recently showed that the
N-terminal domain of DevR plays a decisive role in co- 
operative binding of DevR to DNA and in supporting 
robust gene induction particularly during hypoxia (23). 
These observations emphasize the relevance of examining 
DevR regulon activation mechanisms in the context of 
full-length phosphorylated DevR-DNA interaction 
(rather than with DevRc). 

We recently showed there was a disparity between the 
number of predicted binding sites (11) and experimentally 
determined DevR binding sites in two regulon members, 
namely, tgs1-Rv3131 and fdxA genes. Moreover, the newly
discovered binding sites were functionally important 
(21,24). This raised a need to examine other DevR target 
promoters to gain a comprehensive understanding of 
regulon activation mechanisms. Here, we experimentally 
determined the promoter structure of all the known target 
genes of this regulon. Forty-seven experimentally deter- 
mined sites were used to generate a sequence logo and 
the functional importance of each nucleotide at critical 
positions was assessed. We show that tandem positioning 
and helical phasing of DevR binding sites are essential 
for cooperative binding and synergistic transcriptional
activation. Our Green Fluorescent Protein (GFP) reporter
data suggest that DevR regulon genes are temporally and
differentially regulated in a complex manner.

MATERIALS AND METHODS 

Bacterial strains, plasmids and primers 

Mycobacterium tuberculosis strains were cultured at 37°C
in Dubos medium containing 0.05% Tween-80 plus 0.5%
Albumin-0.75% Dextrose-0.085% NaCl (ADS complex).
Escherichia coli strains that were used and their culture
conditions were as described earlier (25). All the plasmids
used in this study are listed in Supplementary Table S1. All
primers used in the construction of gfp transcriptional
fusions and in the preparation of DNA fragments for DNase
I footprinting are listed in Supplementary Table S2.

Construction of gfp transcriptional fusions 

The putative promoter regions of all DevR regulon genes 
were amplified from M. tuberculosis H37Rv DNA using 
primers (listed in Supplementary Table S2) and cloned 
into the promoterless GFP reporter plasmid, pFPV27 
(26) at the EcoRI site to place the promoters upstream 
of the GFP open reading frame. Dev box mutants and 
variants were generated by the mega primer method of 
mutagenesis as described (27). The sequences of all the
cloned inserts and mutations were confirmed by DNA
sequencing. The promoter plasmids were electroporated 
into M. tuberculosis H37Rv and GFP reporter assays 
were carried out as described below. 

GFP reporter assay 

GFP reporter assays were carried out as described
previously (19). Briefly, Mtb strains carrying various promoter
constructs were grown simultaneously under aerobic
conditions and subcultured two to three times until all the
strains were growing at apparently similar rates
according to A595 measurements. At A595 = ~0.3-0.4,
the cultures were diluted to an optical density at
595 nm of ~0.05, and standing cultures were established
by dispensing 200-µl culture aliquots in quadruplicate
into 96-well black, clear-bottom microtiter plates
(Becton, Dickinson and Co., UK); the plates were
incubated at 37°C. Separate plates were used for each
time point. GFP fluorescence in Mtb cultures was
measured in a spectrofluorimeter using excitation and
emission wavelengths of 483 and 515 nm, respectively,
and expressed as relative fluorescence units (RFU) per
unit optical density at 595 nm (OD) after correcting for
the background fluorescence of bacteria harboring the pFPV27
control vector, which varied between ~30 and 45 RFU/OD.
Fold induction is the ratio of RFU/OD of standing
cultures to that of aerobic shaking cultures at 48/72 h
(without subtracting vector background).
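
The normalization and fold-induction arithmetic described above can be sketched as follows. This is only an illustrative sketch: the function names and the numerical inputs are hypothetical and are not taken from the study.

```python
def rfu_per_od(rfu, od):
    """Fluorescence normalized to culture density (RFU/OD)."""
    if od <= 0:
        raise ValueError("OD must be positive")
    return rfu / od


def corrected_activity(rfu, od, vector_rfu_per_od):
    """Promoter activity after subtracting the vector-only background (RFU/OD)."""
    return rfu_per_od(rfu, od) - vector_rfu_per_od


def fold_induction(standing_rfu, standing_od, aerobic_rfu, aerobic_od):
    """Ratio of RFU/OD in standing versus aerobic shaking cultures.

    As in the text, no vector background is subtracted for this ratio.
    """
    return rfu_per_od(standing_rfu, standing_od) / rfu_per_od(aerobic_rfu, aerobic_od)


# Hypothetical readings, chosen only to exercise the functions.
print(corrected_activity(rfu=600, od=0.5, vector_rfu_per_od=40))
print(fold_induction(standing_rfu=600, standing_od=0.5,
                     aerobic_rfu=10, aerobic_od=0.25))
```

Keeping the background subtraction separate from the ratio mirrors the two different quantities reported in the text (corrected RFU/OD and fold induction).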

Gel-shift assay and DNase I footprinting 

DevR protein was purified as described previously (24). 
Phosphorylated DevR was prepared using acetyl phos- 
phate for all interaction experiments because we showed 
previously that unphosphorylated or phosphorylation-defective
protein (DevR D54V protein) does not bind to
DNA in a sequence-specific manner (19). DevRc protein 
(141-217 amino acids of DevR) was expressed and purified 
from pUS-DevRc as described (23) and was generously 
provided by Dr U.S. Gautam. DNase I footprinting and
gel-shift assays were carried out as described previously 
(19,21). The fraction of bound DNA was estimated 
using Quantity One software (Bio-Rad). 

Computational analysis 

For the 25 DevR primary binding sites, the Delila programs
mkdb, dbbk, catal, delila and alist were used to convert the
sequences into Delila format and to align them (28,29). An
information curve was made using encode and rseq, and
the average was shown as a sequence logo using dalvec and
makelogo. Once the logo was made, the ri program was
used with the same sequences to create an individual
information weight matrix. Finally, the makewalker program
was used to generate the walkers for the primary and
secondary DevR binding sites. Three numbers are reported in
the vertical box above or below the zero line opposite the
base. The first is the position of the box on the sequence.
The second is the sequence conservation of the entire
binding site, given in bits. The third is the Z-score,
which conveys the probability that a particular sequence
is a member of the sites used to create the matrix.
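
The idea behind an individual information weight matrix can be illustrated with a minimal sketch. It is not the Delila ri program: it assumes the weight of base b at position l is 2 + log2 f(b, l) bits, leaves out the small-sample correction, adds a pseudocount (an assumption made here to avoid log2(0)), and uses the population standard deviation for the Z-score. The aligned sequences are hypothetical toy inputs, not the 25 primary sites.

```python
import math
from statistics import mean, pstdev

BASES = "ACGT"


def weight_matrix(sites, pseudocount=0.5):
    """Build per-position weights in bits: 2 + log2(f(b, l)).

    The pseudocount is an assumption of this sketch (it avoids log2(0));
    the small-sample correction is deliberately left out.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must be aligned and of equal length")
    total = len(sites) + len(BASES) * pseudocount
    matrix = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        matrix.append({
            base: 2.0 + math.log2((column.count(base) + pseudocount) / total)
            for base in BASES
        })
    return matrix


def individual_information(seq, matrix):
    """Sum the weights of the bases in one sequence (Ri, in bits)."""
    return sum(matrix[pos][base] for pos, base in enumerate(seq))


def z_score(seq, sites, matrix):
    """Distance of a sequence's Ri from the mean Ri of the sites, in SD units."""
    scores = [individual_information(site, matrix) for site in sites]
    spread = pstdev(scores)
    if spread == 0:
        raise ValueError("all sites have the same Ri; Z-score is undefined")
    return (individual_information(seq, matrix) - mean(scores)) / spread


# Hypothetical 6-bp toy alignment, used only to exercise the functions.
toy_sites = ["AGGGCC", "AGGGCC", "TGGGCC", "AGGACC"]
toy_matrix = weight_matrix(toy_sites)
print(round(individual_information("AGGGCC", toy_matrix), 2))
print(round(z_score("TTTTTT", toy_sites, toy_matrix), 2))
```

A sequence that matches the frequent base at every position scores high, while a sequence built from rare bases scores low or negative, which is the property used in the text to separate strong from weak sites.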

RESULTS

The DevR regulon comprises ~48 genes and plays a
key role in Mtb adaptation to a variety of environmental
cues including hypoxia and exposure to nitric oxide,
carbon monoxide or ascorbic acid (6,8-11). These
genes are arranged singly (e.g. Rv2623, Rv0571c etc.) or
in clusters and operons (e.g. Rv3134c-devRS,
Rv2031c-Rv2028c) or are transcribed in a divergent manner
(e.g. Rv1738-narK2). The regulon comprises some
well-known genes including the devRS two-component
system itself, hspX, etc. Although the role of DevR in
regulating the activity of a few target promoters such
as devRS, hspX, narK2 and tgs1 has been characterized
(19-21), little is known about the regulatory control of
most of the DevR regulon, which comprises many genes
of unknown function. Although a wide range of induction
responses was noted (8,11) and the extent of
induction appeared to correlate with the number of
DevR binding sites present in the target promoters (11),
some aspects were puzzling. Many strongly induced genes
such as Rv0079, Rv2628 and fdxA reportedly contain only
one binding site, whereas low levels of induction were
associated with the presence of several binding sites in
other instances, e.g. Rv1997 and Rv1733c. This raises a
question regarding the relationship between the number of
DevR binding sites on the one hand and the activation
response (timing and magnitude) on the other. Other
emergent questions relate to the role of DevR in inducing
disparate expression of divergently arranged genes, the
role of the newly discovered binding sites in cooperative
recruitment of DevR to weak secondary sites, the relevance
of tandem spacing of DevR binding sites and the relevance
of differences in sequence content among primary and
secondary binding sites. On the basis of previous
predictions by statistical models (30) and the occurrence of
small intergenic regions, many of the DevR regulon genes
could be arranged in operons. In the present study, a
comprehensive analysis of the DevR regulon comprising ~43 genes
transcribed from 27 putative promoters (Supplementary
Figure S1) was undertaken to answer the questions
posed above.

The minimal binding site of DevR 

The Dev box was suggested to be 18- or 20-bp long on the
basis of in silico analysis (11,31). Toward defining its size
by experimental means, gel-shift assays were performed
with double-stranded oligonucleotides of varying lengths
harboring the primary narK2 binding motif, P1 [previously
named D1 in (20); Figure 1]. The 20- and 18-bp boxes
bound DevR with nearly equivalent efficiency. Further
shortening of the box to 16 and 14 bp led to progressively
weaker binding and abrogation of binding, respectively
(Figure 1A and C). Therefore, we conclude that the two
peripheral nucleotides of the P1 box are dispensable for
binding, in agreement with the in silico prediction of
an 18-bp long binding motif (31). Substitution of the
peripheral nucleotides (Figure 1B) demonstrated that
DevR interacts with 16- and 14-bp long motifs but not
with the 12-bp box (Figure 1B and C), suggesting that
DevR intimately contacts only the core 14-bp sequence
of the binding site although a minimum of 18 bp is
required for stable binding.
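
The deletion and substitution oligonucleotides used in this experiment follow a simple pattern, which the sketch below reproduces from the 20-bp P1 sequence listed in Figure 1. The function names are hypothetical, and the flank rule (left flank taken from AGTC, right flank its reverse complement) is an assumption read off the listed sequences, not a design rule stated by the authors.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def deletion_variant(box, length):
    """Trim an equal number of bases from both ends to reach `length`."""
    trim, odd = divmod(len(box) - length, 2)
    if trim < 0 or odd:
        raise ValueError("length must be shorter by an even number of bases")
    return box[trim:len(box) - trim]


def substitution_variant(box, core_length, flank_source="AGTC"):
    """Keep the central core and replace the flanks, preserving total length.

    Assumption: the left flank is a prefix of `flank_source` and the right
    flank is its reverse complement, as the Figure 1 sequences suggest.
    """
    core = deletion_variant(box, core_length)
    flank = (len(box) - core_length) // 2
    if flank > len(flank_source):
        raise ValueError("flank_source is too short for this core length")
    left = flank_source[:flank]
    return left + core + reverse_complement(left)


P1_20 = "TTAGGGCCGGAAGTCCCCAA"  # narK2 P1-20, top strand, from Figure 1
for size in (18, 16, 14):
    print(size, deletion_variant(P1_20, size))
for core in (16, 14, 12):
    print(core, substitution_variant(P1_20, core))
```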



[Figure 1, panels A-C: EMSA with DevR~P at 0, 3 and 6 µM and the oligonucleotides P1-20, P1-18, P1-16, P1-14, P1 20-16, P1 20-14 and P1 20-12.]

Oligonucleotides used in EMSA

Name              Sequence (5'-3')          Relative DevR binding activity (%)
narK2 P1-20       TTAGGGCCGGAAGTCCCCAA      100
narK2 P1-18       TAGGGCCGGAAGTCCCCA        94
narK2 P1-16       AGGGCCGGAAGTCCCC
narK2 P1-14       GGGCCGGAAGTCCC
narK2 P1 20-16    AG AGGGCCGGAAGTCCCC CT
narK2 P1 20-14    AGT GGGCCGGAAGTCCC ACT
narK2 P1 20-12    AGTCGGCCGGAAGTCCGACT

Figure 1. Size of DevR binding site. (A and B) Electrophoretic mobility shift assay (EMSA) with double-stranded oligonucleotide variants (deletion
and substitution) of the narK2 P1 Dev box (P1-20). 'f' and 'b' refer to free and bound DNA, respectively. (C) Percentage of DevR binding to
oligonucleotide variants in relation to P1-20 box at 6 µM protein concentration. For simplicity, only the top strand is shown. P1 sequences are
underlined in P1-20 substitution variants.

The presence of primary and secondary binding sites is a
universal feature of DevR regulon promoters

Most of the DevR-regulated genes were predicted earlier to
have a single Dev box in their promoter regions (Table 1).
One additional binding site each, located adjacent to the
predicted DevR binding site, was discovered during
characterization of the tgs1-Rv3131 and fdxA promoters.
Importantly, the additional site was vital for gene
expression (21,24). This finding raised the possibility
that additional functional cryptic binding sites could
also be present in other DevR regulon promoters. Toward
gaining insights into the promoter structure of regulon
genes, we analyzed DevR binding properties at all the
putative regulon promoters by DNase I footprinting and
searched the protected DNA regions for 18-bp Dev box or
Dev box-like sequences. The most striking characteristic
of all the regulon promoters was the presence of two to
four tandemly arranged upstream binding sites. Not a
single regulon promoter featured only one DevR binding
site (Table 1, Figure 2, Supplementary Figures S1-S5).

We have recently shown that the C-terminal DNA
binding domain of DevR (DevRc) is defective in cooperative
interaction although it binds to primary binding
sites (23). In the present study, this property was exploited
to distinguish between primary and secondary binding
sites in target promoters. By the criterion that they were
bound by DevR but not by DevRc, and by computational
analysis (see below), 22 of the 24 newly discovered
sites were classified as secondary binding sites (Table 1,
Figure 2, Supplementary Figures S1-S5).

Computational analysis of DevR binding sites 

DNase I protected regions of all the regulon promoters
were analyzed for the presence of DevR binding sequences
using computational analysis tools (see 'Materials and
methods' section). The analysis led to the discovery of
24 new binding sites, of which 22 were poorly conserved.
Forty-seven experimentally verified DevR binding sites,
including 25 primary sites and 22 secondary sites from
this study and previous studies (19-21,24), were
used to generate sequence logos and the total information
content (Ri value) for individual Dev boxes (Figure 3,
Supplementary Tables S3-S5). As the information
content (Ri value) for each binding site can be correlated
with the binding energy (32,33), we can generally
assume the Dev box at each promoter with a high Ri value
to be a primary site (P) and that with a low Ri value to be
a secondary site (S). These assumptions are considered to be
largely valid on the basis of experimental evidence from
DNase I footprinting analysis (see above). C7 is the most
conserved nucleotide in the Dev box and it is present in all
47 sites. Other highly conserved nucleotides are G3, G4
and G5 in both strands of the primary binding
site (Figure 3). In contrast, the first and last positions
in the sequence logo are poorly conserved; i.e. they are
not information rich. This is consistent with the results
of the gel-shift assay (Figure 1), where shortening of the
binding motif to 16 bp is tolerated provided the overall
length is maintained at 20 bp. The results of the gel-shift
assay and computational analysis suggest that a motif
length of 18 bp is optimal. In the secondary boxes,
nucleotides G4, C7 and G12 are well conserved and have scores of
1 bit or more (Figure 3). Because DevR does not bind to
secondary sites in the absence of binding to the primary
site, interactions with these nucleotides appear to
enable cooperative recruitment of DevR to secondary
binding sites in spite of their low information content.

'Sequence walkers' are a graphical method for displaying
how binding proteins interact with individual bases
of nucleotide sequences. We used the makewalker program to
generate the walkers for all the DevR primary and secondary
binding sites (Figure 4, Supplementary Figure S6).
This allowed us to determine the contribution of each
base (positive or negative) to the average sequence
conservation of the binding site, as represented by a sequence
logo.

Identification of nucleotides critical for DevR binding 

It is evident from the sequence logo that the G4, G5 and C7
nucleotides [previously numbered as G5, G6 and C8,
respectively in (11,19-22)] are highly conserved in both
strands of the primary binding sites. Toward analyzing
the importance of each of these nucleotides in DevR-DNA
interaction, mutant variants of a well-conserved
primary binding site (narK2 primary Dev box P1,
Ri = 20) were assessed for binding to DevR using gel-shift
assays (Figure 5). In each variant box, the wild-type
(WT) nucleotide was mutated to the least frequently
occurring nucleotide at that position (Supplementary
Table S3). A severe loss in binding was observed on
substitution at the G5 (M-5, 90%) and C7 positions (M-7, 93%),
which indicates that their strong conservation in the
binding site is of high functional consequence, while a
less severe binding defect was noted on substitution at
the G3 and G4 positions. A complete loss of DevR binding
was observed when substitutions were introduced in
both halves of the Dev box at any one of these four
positions (M-3+3, M-4+4, M-5+5 and M-7+7). Similarly,
binding was completely abolished when mutations
were introduced in combination at nucleotide positions
3+7, 4+7, 5+7, 3+5 and 4+5 in one half of the
Dev box. Binding was not significantly impaired when
the Dev box was mutated at position 8 (M-8, M-8+8).
Intriguingly, substitution at position 9 in both halves
(M-9+9) resulted in a complete loss of binding. Although the
nucleotide at position 9 is not highly conserved and is usually
occupied by A or T, it could help in bending of DNA
during DevR-DNA interaction (discussed later). The
importance of C7 in binding is again underscored by
the failure of DevR to bind to the D4 motif at the
narK2-Rv1738 promoter, which bears a natural mutation
at position C7 (20) and was previously suggested as a
binding site (11).
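
The half-site numbering used for these mutants can be made concrete with a short sketch. It assumes the 123456789987654321 numbering shown with Figure 5, reads the left half on the top strand and the right half on the bottom strand, and uses the 18-bp narK2 P1 sequence from Figure 1. The function names and the substituted base are hypothetical; the actual substitutions are listed in Supplementary Table S3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def half_site_bases(box, position):
    """Return the bases at a half-site position (1-9) in both halves.

    The left half is read on the top strand; the right half is read on the
    bottom strand, so the top-strand base is complemented.
    """
    if len(box) != 18 or not 1 <= position <= 9:
        raise ValueError("expected an 18-bp box and a position from 1 to 9")
    left = box[position - 1]
    right = box[len(box) - position].translate(COMPLEMENT)
    return left, right


def substitute(box, position, new_base, both_halves=False):
    """Mutate a half-site position in the left half, or in both halves."""
    bases = list(box)
    bases[position - 1] = new_base
    if both_halves:
        # The bottom strand should read new_base, so the top strand
        # receives its complement at the mirrored position.
        bases[len(box) - position] = new_base.translate(COMPLEMENT)
    return "".join(bases)


P1_18 = "TAGGGCCGGAAGTCCCCA"  # narK2 P1-18, top strand, from Figure 1
for pos in (3, 4, 5, 7):
    print(pos, half_site_bases(P1_18, pos))
print(substitute(P1_18, 5, "T"))                    # one half mutated
print(substitute(P1_18, 5, "T", both_halves=True))  # both halves mutated
```

Under these assumptions, positions 3, 4 and 5 read G and position 7 reads C in both halves of the P1 box, which matches the conserved nucleotides named in the text.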

DevR regulon promoters are assigned to four
architectural classes

The complete experimental mapping of DevR binding
sites enabled us to group all the DevR regulon promoters
into four different classes (Table 1 and Supplementary
Figures S2-S5). The simplest of them, the Class I
promoters, contain two neighboring Dev boxes in either the
P-S or S-P arrangement (secondary site proximal to ATG/GTG
or primary site proximal to ATG/GTG, respectively;
Figure 2). Rv0571c was an exceptional promoter
of this class and possesses two binding sites that were
both designated as primary sites (P1-P2) on the basis of
their interaction with DevRc (Supplementary Figure S2,
Supplementary Tables S4 and S5). Promoters containing
three DevR binding sites are categorized as Class II
promoters and have a P-S-S arrangement (Rv0079;
Supplementary Figure S3, panel A) or a P-S-P configuration
(Rv1733c and hspX-Rv2032 promoters;
Figure 2, Supplementary Figure S3, panel B).

Table 1. Architecture of DevR regulon promoters
(C) Class III (4 Dev boxes)
'P', 'S' and 'E', primary, secondary and extended DevR binding site, respectively.
According to (11,31).
Location of sites is indicated in Supplementary Figure S1.
Distance of proximal Dev box from start codon (ATG/GTG).
TSP, Transcriptional Start Point; ND, not determined.

Class III promoters are those with four tandem Dev
boxes. Interestingly, the Rv2627c promoter, which was
originally reported to have a single binding site, actually
contains four Dev boxes that include a primary site
and three secondary sites (Table 1 and Supplementary
Figure S4, panel A). Even more interesting among the
Class III promoters are the divergent narK2-Rv1738
genes whose intergenic region is bound by DevR (20).
Three binding sites, namely P1, S2 and P2 (earlier
named D1, D2 and D3), were previously shown to be
required for transcription of both the divergent genes (20),
and continuous protection over a ~80-bp stretch of
DNA suggested the presence of an additional site.
Accordingly, a search was made and a fourth potential
binding site, named S1, was detected by careful
visual inspection and by walking down (using sequence
walkers) from P1 towards S2 (Figure 2, Supplementary
Figure S4, panel B and Figure 4). The first half (9 bp) of
the S1 box is quite degenerate (G12T14C15T16 in lower
strand), whereas the second half is highly conserved
(G3G4G5C7; Supplementary Figure S4, panel B and
Figure 4). Mutation in the second half of the palindrome
at the G4 and C7 nucleotides abolished DevR binding and
established S1 as a genuine secondary DevR binding site
(Supplementary Figure S4, panel B). Interestingly, upon
mutating the S1 site (mut-S1), binding to the adjacent site,
P1, was also abolished at lower protein concentration, but


Nucleic Acids Research, 2011, Vol. 39, No. 17 7405 



[Figure 2 panels: Class I, Rv2626c (genes Rv2624c, Rv2625c, Rv2626c, Rv2627c); Class II, Rv1733c (genes Rv1732c, Rv1733c, Rv1734c, Rv1735c); Class III, narK2-Rv1738 (genes narX, narK2, Rv1738, Rv1739c); Class IV, Rv1813c (genes Rv1813c, Rv1814). Footprint lanes show DevR~P and DevRc (µM) alongside sequencing lanes (A, G, C, T); the DevR binding region and the primary (P) and secondary (S) boxes are marked.]

Figure 2. Representative DNase I footprints at promoter regions of Rv2626c (Class I), Rv1733c (Class II), narK2-Rv1738 (Class III) and Rv1813c
(Class IV). DNase I footprinting analyses of other members of each class are shown in Supplementary Figures S2-S5. The genomic organization of
each gene and its operon in Mtb is shown at the top where DevR regulon genes are indicated by black arrows. 'P' and 'S' refer to primary (black
box) and secondary (white box) Dev box, respectively. Bent arrow indicates the direction and predicted position of translational start codon. Dideoxy
sequencing reactions using the same primer and DNA template are also shown. The rectangle (black) marked inside the footprint indicates the box
that was not bound by DevRc.

[Figure 3 panels: 25 DevR primary binding sites; 22 DevR secondary binding sites.]

Figure 3. Sequence logos of DevR primary and secondary binding sites. The height of each letter is proportional to the frequency of base and the
height of the letter stack is conservation in bits at that position. Error bars are shown at the top of the stacks. The sine wave represents the
accessibility of a face of DNA (B-form, 10.6 bases of helical pitch) with the major groove centered at positions 4 and 14.6.



regained at higher concentration, which shows that
DevR binding to these sites is mutually cooperative
(Supplementary Figure S4, panel B). Neither of the
secondary sites (S1 and S2) at the narK2-Rv1738 divergent
promoter was protected by DevRc (Figure 2). The binding
defect at the mutated S1 site was associated with a
functional defect too; introduction of the above-described
mutations in the pnarK2 GFP reporter plasmid (pnarK2 mut-S1)
reduced inducible narK2 promoter activity
by 78% (from 1208 ± 83 RFU/OD with wt pnarK2 to
265 ± 57 with pnarK2 mut-S1). In contrast, only a
marginal 15% reduction in Rv1738 divergent promoter
activity was observed with mut-S1 (from 10610 ± 88
RFU/OD with wt p1738 to 9040 ± 72 with p1738
mut-S1). These results conclusively establish the newly
identified S1 box to be a genuine DevR binding site that
is required for full hypoxic induction of the narK2
promoter (which is proximal to it) and, to some extent, of
the Rv1738 promoter (which is distal to it). We conclude
that the full induction of narK2-Rv1738 promoters
requires DevR interaction with all four boxes in the
intergenic region.
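
The percentage reductions quoted above follow directly from the reported mean RFU/OD values. The short check below uses only those means (the ± errors are ignored); the function name is hypothetical.

```python
def percent_reduction(wild_type, mutant):
    """Loss of promoter activity relative to the wild-type construct, in percent."""
    if wild_type <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * (wild_type - mutant) / wild_type


# Mean RFU/OD values reported in the text for wt and mut-S1 constructs.
narK2_loss = percent_reduction(1208, 265)
rv1738_loss = percent_reduction(10610, 9040)
print(round(narK2_loss), round(rv1738_loss))
```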

Class IV regulon promoters have the most complex
structure. They contain not only primary and secondary
DevR binding sites but also display an extended DNase I
protected region referred to as E (Figure 2, Supplementary
Figure S1). The extended protected regions were always
located adjacent to a secondary or primary site (Table 1,
Figure 2, Supplementary Figures S1 and S5) and
were from 13 to 34 bp long in the Rv1734c, Rv1813c and
Rv3127 promoters. Binding to the E regions is likely due
to highly cooperative interaction that enables protein
binding even to highly degenerate sites, as observed
previously in the case of PhoP in Streptomyces coelicolor
(34). The sequence walker method was employed to
detect binding sites in the footprinted regions of these
DNAs. Interestingly, the E regions in all three promoters
showed two possible Dev box-like sequences that
possessed highly conserved nucleotides such as G5 and
C7 (represented by walkers, gray rectangular boxes;
Figure 4 and Supplementary Figure S6).



Tandem arrangement of Dev boxes is essential for 
DevR-mediated transcription 

One of the most interesting findings of this study is that
binding sites are arranged in an adjacent manner in the
promoter regions of all DevR-regulated genes (with the
exception of Rv1733c), suggesting that this arrangement
holds the key for DevR-mediated gene activation. We
assessed the importance of spacing by analyzing the
tgs1-Rv3131 promoters, which belong to the Class I
category. Here, the primary binding site P is placed 3 bp
apart from the secondary binding site S and in the same
helical phase (-42.5 and -63.5; Figure 6). We showed
recently that cooperative DevR interaction with both
sites is essential to activate divergent transcription (21).
The functional importance of the tandem and helical
phase arrangement of binding sites was assessed by
introducing 5, 10 or 15 bp of DNA sequence (corresponding
to 0.5, 1 or 1.5 helical turns; Figure 6A) to alter the
spacing between the P and S sites. When the P and S boxes
were out of phase (0.5 and 1.5 turns in pTGS+5 and
pTGS+15, respectively), DevR interacted with the P
box but binding to the S site was abolished (Figure 6B);
this was accompanied by a drastic reduction of tgs1
promoter activity to ~9% of WT promoter activity
and complete abrogation of Rv3131 promoter activity
(Figure 6C). Restoration of the helical phase between the
P and S sites, i.e. insertion of a 10-bp sequence, partially
rescued cooperative DevR binding (Figure 6B) and also
promoter activity in pTGS+10 (18%) but not in
p3131+10 (Figure 6C). There are two possible reasons
for the abrogation of p3131+10 activity: first, a failure
to recruit DevR cooperatively to the TSP-proximal
box (S site) that is distanced by 10 bp in the mutant
construct and, second, an intrinsically weak
promoter that supports minimal independent engagement
of RNA polymerase. Taken together, these results show
that proximity of two Dev boxes is essential for maximal
binding and gene activation. Our results also suggest the
importance of proper helical phasing in cooperative
binding.
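
The effect of the 5-, 10- and 15-bp insertions on helical phasing can be estimated with a small calculation. The sketch assumes that the two box centers lie at -42.5 and -63.5 (21 bp apart), that inserted base pairs simply add to this distance, and that the helical pitch is the 10.6 bp per turn quoted for B-form DNA in the Figure 3 legend. The function names are hypothetical.

```python
def helical_turns(spacing_bp, bp_per_turn=10.6):
    """Number of helical turns spanned by a center-to-center spacing."""
    return spacing_bp / bp_per_turn


def phase_offset_degrees(spacing_bp, bp_per_turn=10.6):
    """Rotational offset between two sites, folded into the range 0-180 degrees.

    Values near 0 mean the sites face the same side of the helix;
    values near 180 mean they face opposite sides.
    """
    angle = (helical_turns(spacing_bp, bp_per_turn) * 360.0) % 360.0
    return min(angle, 360.0 - angle)


WT_SPACING = 63.5 - 42.5  # center-to-center distance of the P and S boxes (bp)
for insertion in (0, 5, 10, 15):
    spacing = WT_SPACING + insertion
    print(insertion, round(helical_turns(spacing), 2),
          round(phase_offset_degrees(spacing), 1))
```

Under these assumptions the wild-type and +10 spacings leave the two boxes on nearly the same face of the helix, whereas the +5 and +15 spacings place them on roughly opposite faces, consistent with the loss of cooperative binding reported for the out-of-phase constructs.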

Figure 4. Computational analysis of Dev boxes belonging to representative Class I-IV promoters using sequence walkers. Primary and secondary
boxes are in black and white, respectively. Extended sites (E) are in gray. The individual information (Ri) in bits is shown above each box. The height
of the letters is the information content in bits (upper edge at +2 bits and lowest edge at -4 bits), and represents the contribution of each base to the
conservation of the sequence.



DevR regulon genes are temporally regulated under 
hypoxic conditions 

The relation between promoter classes and their activation 
response was assessed next by a temporal analysis of 27 
DevR regulon promoters controUing the expression of 
more than 43 genes of the regulon (arranged in operons) 
using GFP reporter assay over a period of 7 days. The 
standing hypoxia model was used wherein standing of 
aerobic cultures creates a gradual local hypoxic environ- 
ment in settled bacteria. This model is widely used to 
understand the hypoxic response of Mtb (19,31,35-37). 
Considering the variations in binding site arrangements, 
it is not surprising that the DevR regulon promoters 



are not synchronously induced, but rather their activation 
occurs in a temporal fashion (Table 2 and Supplementary 
Figure S7). All the induced genes were placed in three 
groups based on their temporal response; two promoters 
of Classes II and III were induced earliest at 4 h [hspX and 
Rvl738, 'early'), 16 promoters from Classes I, II and III 
were induced at 6/8 h ('intermediate', including Rv3134c, 
which are divergent to the 'early' induced genes) and five 
'late' promoters were induced at 12/24 h. Four proinoters, 
namely, Rv0572c, Rvl812c, Rvl734c and Rv3126c, were 
not hypoxia inducible and our observations with regard 
to these four promoters are consistent with some previous 
reports (9,31,35,38). For most of the genes, hypoxic 
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123456789987654321 
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Figure 5. Nucleotides critical for DevR binding. EMSA was carried out with DevR (2.3 and 4.6 |iM final concentration) and 18-bp oligonucleotide 
carrying WT or mutant narK2 PI Dev box. The table shows the percent DevR binding to mutant oligonucleotides relative to DevR binding with wild 
type imrK2 PI Dev box (100%). 



induction peaked at 72-120 h in this assay set up (Table 2 
and Supplementary Figure S7). Those genes that were 
induced early were also the most highly induced and this 
may reflect their importance for an effective dormancy 
response (Table 3). 
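
The grouping described above can be written as a small lookup. The following Python sketch is illustrative only: the function name `onset_group` is hypothetical, and the onset times and fold values are the ones quoted in this section for a few promoters (the fold values for hspX and Rv2032 are approximate), not the full data set of Table 2.

```python
# Map the time of first detectable induction (hours) to the temporal group
# used in this study: 4 h = early, 6/8 h = intermediate, 12/24 h = late.
GROUP_BY_ONSET_H = {4: "early", 6: "intermediate", 8: "intermediate",
                    12: "late", 24: "late"}


def onset_group(onset_h):
    """Return the temporal group for an onset time, or None if not induced."""
    if onset_h is None:
        return None  # promoter not hypoxia inducible in this assay
    return GROUP_BY_ONSET_H[onset_h]


# (onset in hours, approximate fold induction) as quoted in the text.
examples = {
    "hspX": (4, 300),
    "Rv0569": (6, 212),
    "Rv2032": (8, 72),
    "Rv0574c": (24, 3.5),
    "Rv1734c": (None, None),  # reported as not hypoxia inducible
}

groups = {gene: onset_group(onset) for gene, (onset, _fold) in examples.items()}
for gene, group in groups.items():
    print(gene, group)
```

Keeping onset time and fold induction as separate fields makes it easy to compare the two properties promoter by promoter.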

Importantly, the induction of hspX and Rv1738 precedes
that of devRS (transcribed from Rv3134c promoter) and 
suggests that their induction merely requires activation of 
existing DevR molecules, and no new synthesis is neces- 
sary (see 'Discussion' section). The other genes of the 
regulon are induced along with or after induction of the 
Rv3134c promoter. This inducible promoter directs the 
synthesis of DevRS (from the Rv3134c-devRS operon) 
under hypoxia and it is autoregulated (19). Thus, the in- 
duction and sustained expression of the regulon can be 
explained by the increase in the intracellular levels of 
DevR due to positive autoregulation. 

Strikingly, the temporal expression or the magnitude of 
induction of regulon genes does not appear to be depend- 
ent entirely on the number of DevR binding sites or 
their arrangement in the promoter regions (Table 3). For 
example, between Rv0569 and Rv0574c which are both
Class I promoters, the former is induced early
and strongly (at 6 h and 212-fold), while the latter is induced
late and modestly (at 24 h and 3.5-fold, Table 2).
Another example is that of the Class II hspX-Rv2032 pro- 
moters which share three binding sites in their intergenic 
promoter region. While the former is induced early and 
strongly (at 4 h and ~300-fold), the latter is induced later
at 8 h and to ~72-fold. To determine whether temporal
regulation is related to the differential affinity of DevR for 
various target sites, gel-shift assays were performed with 



some temporally induced promoters (Supplementary 
Figure S8). We find that DevR interacts with promoters 
of all the groups (early, intermediate and late) at nearly 
similar protein concentrations (50-100 nM), suggesting 
that the temporal induction of regulon genes is not 
likely to be a sole attribute of the affinity of DevR for
target promoters and likely involves other factors (see 
'Discussion' section). 



DISCUSSION 

This study aimed at obtaining a comprehensive under- 
standing of the DevR regulon transcriptional response. 
Although the role of DevR in mediating activation of a 
few target gene promoters is quite well characterized 
(19-21), regulatory control of most DevR-regulated 
genes remains vastly underexplored. Using the comple- 
mentary approaches of DevR-DNA interaction studies, 
computational analysis and temporal measurements of 
gene expression, we defined the properties of the DevR 
regulon activation response in Mtb. By gel-shift analysis 
of Dev boxes of varying lengths, we defined a DevR 
binding site to be at least 18-bp long. By DNase I foot- 
printing of all regulon promoters, we mapped hitherto 
unknown additional binding sites in most promoters. By 
computational analysis, we defined the logos for primary 
and secondary binding sites. By binding and expression 
studies, we showed that a minimum of two properly 
spaced binding sites in helical phase are required for 
optimum induction. Finally, we sought to determine the 
relationship between binding sites and the timing and 
magnitude of the activation response. 
Figure 6. Adjacent location and helical phasing of Dev boxes are essential for maximal induction. (A) Schematic representation of the tgs1-Rv3131
divergent promoter. (B) DNase I footprinting with DevR and WT, +5, +10 and +15 DNA. (C) GFP reporter assay of WT and altered spacer
variant promoter constructs (+5, +10 and +15). Shown are the average values (± standard deviation) of GFP fluorescence (RFU/OD) from three
experiments each performed in triplicate.



Consensus sequences are thought to present a somewhat 
misleading view of binding sites as they frequently fail to 
identify genuine binding sites or predict sites where there 
are none (39). This is because they represent prominent 
bases at each position that are often absent or less prom- 
inent in the natural context. The sequence logo on the 
other hand allows plotting the conservation across all 
the positions in the set of aligned binding sites and
provides a quantitative measure for affinity of the binding
site. The total information (Rsequence) for a perfect Dev
box is 22.29 and interestingly, such a sequence is not 



present in the entire Mtb genome. The G4, G5 and C7 
nucleotides are highly conserved (> 1 bit) in both halves 
of the binding site (Figure 3) and their functional rele- 
vance was established by gel-shift analysis. Taking 
together the sequence logo and the results of previous 
studies (11,19-21), we can say that G3, G4, G5 and C7 
bases in both strands of the box are the most critical nu- 
cleotides which are recognized by 'direct' read out mech- 
anism (22). Furthermore, among these four bases, G5 and 
C7 were most important and mutating them individually 
reduced DevR-DNA interaction by ~90% (Figure 5). 
Table 2. Temporal expression of DevR regulon promoters (up to 168 h)

Genes    Class(a)    Hours

Data of a representative experiment out of three experiments (whose results vary <10%) are shown.
The heat map (green to red) indicates zero expression to maximum expression as measured by GFP fluorescence.
DevR regulon genes are classified into three groups based on the time when induction was first observed: 'early', 'intermediate' and 'late' genes (in blue shades).
(a) Classes I, IV, etc. refer to promoter architecture (see Table 1).
(b) Reporter GFP fluorescence in RFU/OD units (see 'Materials and Methods' section for details).
ND, not determined.



The simultaneous mutation of two critical nucleotides in 
either of the two halves of the palindrome abolished 
binding completely. Interestingly, on mutating the nucleo- 
tide at ninth position, DevR-DNA interaction was com- 
pletely abolished although this position is not highly 
conserved but usually occupied by A or T nucleotide 
(Figure 3). This nucleotide is shown not to interact directly 
with DevRc (22), and it is possible that these nucleotides 
are recognized by 'indirect' read out mechanism as ob- 
served in case of E. coli cAMP receptor protein (CRP) 
where TiAg is recognized by 'indirect' read out mechanism 
and replacement of TiAg in CRP binding site with CiGg 
causes an 80-fold reduction in CRP affinity by increasing 
the free energy required to bend the DNA (40,41). Further 
studies in this regard would help to understand the 
underlying mechanism of the binding defect observed
here. 
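
The sequence-logo measure discussed above can be illustrated with a short calculation. This is a minimal sketch, assuming equiprobable background bases (a maximum of 2 bits per position) and omitting the small-sample correction that is normally applied; the function names and the toy alignment are hypothetical and are not Dev box sequences from this study.

```python
import math
from collections import Counter


def position_information(column):
    """Information content (bits) of one alignment column of A/C/G/T."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy  # log2(4) = 2 bits is the maximum for DNA


def total_information(sites):
    """Sum of per-position information over equal-length aligned sites."""
    return sum(position_information(col) for col in zip(*sites))


# Toy alignment (hypothetical sequences, for illustration only).
toy_sites = ["GGAC", "GGTC", "GGAC", "GCTC"]
per_position = [position_information(col) for col in zip(*toy_sites)]
```

In this toy case a fully conserved position contributes 2 bits and a position split evenly between two bases contributes 1 bit, which is why positions conserved at more than 1 bit stand out in a logo.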



This study reveals several novel and universal features 
of DevR-mediated activation. DevR binding to the primary
sites is mediated primarily by the strength of
DevR-DNA interaction. We show that secondary sites 
are ubiquitous at known DevR regulon promoters and 
contrary to previous predictions, none of the studied target
promoters feature a single Dev box. The protection of 
the secondary site appears to be cooperative since it is 
severely reduced by distancing it from the primary site 
(this study) or by destroying the neighboring primary 
site (20,21). Thus, binding to the secondary sites can be 
seen as being less dependent on DNA sequence and highly 
dependent on cooperative interactions with neighboring 
DevR molecules. Some of the promoters were observed 
to have three or even more binding sites suggesting the 
likelihood of highly cooperative interactions between as
many DevR dimer molecules. Previously, we have shown 
Table 3. Induction profile of DevR regulon promoters (at 72 h)

Gene name     Class  Probable gene function                       Fold induction  No. of     Cloned promoter
                                                                  (72 h)(a)       Dev boxes  coordinates(b)

                     Triacylglycerol synthase                                     2          -143 to +45
Rv3131               Probable nitroreductase                      178             2               to +38
              I      USP                                                          2          -348 to +90

(C) Late genes (induced at 12/24 h)
Rv1813c       IV     CHP                                          62              -E         -351 to +18
Rv0571c       I      CHP                                          4               2          -178 to +90
Rv0574c       I      CHP                                          3.5             2          -183 to +10
Rv1996        I      USP                                          28              2          -140 to +19
Rv2006/otsB   I      Probable trehalose-6-phosphate phosphatase   7.5             2          -145 to +77

(D) Genes not induced

(a) Fold induction at 72 h is calculated from data of a representative experiment out of three independent experiments, the results of which deviate <10%.
(b) Refers to promoter coordinates with respect to their corresponding putative translational start sites; all constructs are transcriptional fusions with respect to GFP.
USP, universal stress protein; CHP, conserved hypothetical protein; HP, hypothetical protein.




at some DevR targets that cooperative interaction of 
DevR with two or more sites synergistically activates 
gene transcription (19-21). These and the present studies 
collectively suggest that cooperative binding and synergis- 
tic activation could be the universal approach used by 
DevR to bring about maximal induction of the regulon. 

All DevR regulon promoters were placed into four 
classes based on the configuration of binding sites. The 
major category, Class I, includes 14 promoters (controlling
~25 genes), each having two Dev boxes in its promoter
region in tandem arrangement. This arrangement of the
binding sites is functionally important because insertion of
5 or 15 bp between the boxes resulted in loss of S site
binding at the tgs1-Rv3131 Class I promoter. Cooperative
interaction is apparently facilitated by spacing of the bind- 
ing sites at a distance that permits interaction with DevR 
dimer molecules on the same face of the DNA double 
helix. Supportive evidence comes from the partial restoration
of DevR binding to the S site in the tgs1-Rv3131
promoter variant carrying a 10-bp in-phase insertion
between P and S sites. The layout of the Rv2005-otsB
promoter resembles the tgs1-Rv3131 promoter and they



could be regulated in a similar way. Thus, the salient 
features of Class I regulon promoters can be summarized:
two binding sites are essential, they are tandemly placed to 
maintain them on the same face of DNA, a primary 
binding site with the conserved sequence is essential and 
considerable degeneracy is tolerated in the secondary site 
sequence provided that G4, C7 and G12 are conserved in it. 
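
The spacing argument can be checked with simple arithmetic. The sketch below assumes an average B-DNA helical repeat of about 10.5 bp per turn, a textbook value that is not taken from this study, and the function names are hypothetical. It estimates how far an insertion rotates the secondary site around the helix axis relative to the primary site.

```python
BP_PER_TURN = 10.5  # assumed average helical repeat of B-DNA


def rotation_deg(insert_bp, bp_per_turn=BP_PER_TURN):
    """Angular offset (0-360 degrees) introduced by inserting insert_bp."""
    return (insert_bp / bp_per_turn * 360.0) % 360.0


def face_offset_deg(insert_bp, bp_per_turn=BP_PER_TURN):
    """Smallest angle (0-180 degrees) between the original and new face."""
    angle = rotation_deg(insert_bp, bp_per_turn)
    return min(angle, 360.0 - angle)


for bp in (5, 10, 15):
    print(bp, round(face_offset_deg(bp), 1))
```

Under this assumption the 5- and 15-bp insertions place the two sites on nearly opposite faces of the helix (about 171 and 154 degrees apart), whereas the 10-bp insertion leaves them roughly in phase (about 17 degrees), in line with the partial restoration of S site binding in the +10 variant.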

In Classes II, III and IV promoters that contain three or 
even four binding sites sometimes in a complex arrange- 
ment, a primary site binds to DevR and facilitates the 
cooperative binding of additional DevR molecules to the 
remaining sites even if they are poorly conserved and have 
low information content. As the information content is 
correlated with binding energy, it is likely that binding
of DevR to tandem sites of varying affinity is mutually 
stabilized through protein-protein interactions between 
bound molecules as seen in the case of Rv3134c, narK2 
and tgsl promoters (19-21). These cooperative inter- 
actions are expected to decrease the total energy for 
DevR binding and may be a regulatory strategy for effi- 
cient gene induction. Other examples of Class II pro- 
moters, namely hspX and Rv2032, are likely to be 
regulated like the narK2-Rv1738 promoters (20), which were
shown to contain an additional fourth site, S1
(Figure 2C, Supplementary Figure S4). Along similar
lines, it is possible that an additional degenerate binding
site may be located in the 17-bp region between P1 and S
sites in the hspX-Rv2032 intergenic region (Supplementary
Figures S1 and S3, panel B). The Rv1733c promoter is exceptional
as it contains an isolated Dev box (P1) placed
63 bp upstream of the two tandem Dev boxes (S-P2). It
would be interesting to investigate the role of the isolated
Dev box in transcription, especially since it is placed ~6
helical turns apart and suggests the possibility of DNA
looping mediated by DevR interaction with the P1 and
downstream binding sites. Class IV promoters include
Rv1813c, Rv1734c and Rv3127 and have a complex structure
that includes at least one binding site (E) with very
low to poor information content in an extended footprinted
region, E (Table 1). Interestingly, and consistent
with previous observations (31), Rv1734c was not induced
under hypoxic conditions (Table 2). Possible reasons may
be the absence of an active promoter or blocking of transcription
due to DevR bound at the E site. Other genes,
namely, Rv0572c, Rv1812c and Rv3126c are also not
induced under hypoxia (Tables 2 and 3, Supplementary 
Figure S7) and this property is consistent with the 
absence of Dev box-like sequences in their upstream 
regions. 

Temporal analysis of promoter activity has provided 
valuable insights into DevR regulon activity and also 
raised intriguing questions. The genes to be reproducibly 
induced the earliest (at 4 h in this model) were hspX and 
Rv1738. Notably, their induction preceded that of devRS
(transcribed from the Rv3134c promoter). The rapid acti- 
vation of these genes implies a very early role for them 
during initiation of the dormancy response. The pattern of 
gene induction observed in the present study highlights
important aspects of the DevR regulon response. First, 
mild hypoxia is adequate to trigger the DevRS-DosT signaling
pathway and to activate DevR (through phosphorylation).
Second, the induction of hspX and Rv1738 is
mediated by the activation of existing DevR molecules
(from basal-level expression) and not those produced
post-induction. The presence of a high density of binding
sites including two primary sites most likely enables co- 
operative binding of DevR at basal protein concentrations 
and very early recruitment of RNA polymerase to bring 
about rapid gene induction. Third, the subsequent activa- 
tion of other promoters including Rv3134c-devRS suggests 
that a higher level of DevR~P is required for this 
temporal program. This requirement can be met by 
positive autoregulation which results in an increase of 
intracellular DevR protein concentration under inducing 
conditions (13,19,36). Temporal expression suggests that 
target genes may require a threshold level of DevR~P for 
activation and therefore their expression profile is 
switch-like; transcription is absent below this threshold 
and induced above it. The occurrence of a threshold 
response has been suggested by a recent study where we 
showed that DevR regulon activation and hypoxia adap- 
tation is compromised by diversion of the activating 
phosphosignal from DevR (13). Thus, although 



'adequate' intracellular DevR~P levels are crucial to 
switching the regulon genes 'on', the lack of a direct rela- 
tionship between the nature and number of binding sites 
on one hand and timing and magnitude of the activation 
response on the other is striking. 
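
The switch-like behaviour proposed above can be pictured with a Hill-type activation curve. This is purely an illustrative model, not one fitted or used in this study: the function name, the threshold `k`, the Hill coefficient `n` and the concentration units are all hypothetical.

```python
def hill_activation(devr_p, k=1.0, n=4):
    """Fractional promoter activity for a DevR~P level (arbitrary units).

    k is the hypothetical threshold giving half-maximal activity and n is a
    hypothetical Hill coefficient; larger n gives a sharper switch.
    """
    if devr_p < 0:
        raise ValueError("concentration must be non-negative")
    return devr_p ** n / (k ** n + devr_p ** n)


# Activity at levels below, at and above the hypothetical threshold k = 1.
levels = [0.25, 0.5, 1.0, 2.0, 4.0]
activity = [hill_activation(x) for x in levels]
```

With these assumed parameters, activity stays close to zero well below the threshold and approaches its maximum a few-fold above it. Promoters with different hypothetical thresholds would then switch on at different times as DevR~P accumulates.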

We show that DevR regulon genes are temporally and 
differentially regulated but temporal regulation does not 
appear to correlate completely and exactly with the 
affinity of DevR for target genes. Thus, promoters 
appear to bind DevR with equivalent affinity 
(Supplementary Figure S8), yet show differences in 
temporal expression (Table 2 and Supplementary Figure 
S7). The absence of a clear relationship between binding 
sites and the induction response leads us to propose that
the temporal nature of the induction response is not a sole 
attribute of the affinity of DevR for various target pro- 
moters and may involve other factors such as variation in 
intrinsic promoter strengths, participation of other cis 
elements and trans-acting factors, or proteins other than
DevS/DosT that may influence the phosphorylation (acti- 
vation) state of DevR. For example, the involvement of 
some negative cis/trans regulators was suggested for 
modulating narK2 expression (20,42). Furthermore, 
recent studies have suggested that DevR regulon expres- 
sion could be fine tuned through MprA, PhoP and PknH 
(42-45). However, the exact molecular mechanisms
involved are yet unknown. Such a complex interplay of 
multiple regulators is well understood in the case of CRP 
in E. coli wherein co-regulators like MelR, AraC, CytR
and GalR modulate the expression of CRP-controlled 
promoters (46-49). Complex interactions between
multiple regulators at individual promoters could explain 
the lack of correlation between CRP-dependent promoter 
activity and the quality of the CRP binding site (50). It
would be exciting and relevant to decipher the apparently 
complex regulation of the DevR regulon. 